Optimization of Voice/Music Detection in Sound Data
Authors
Abstract
Automatic detection of voice and music segments is needed in a variety of applications. In general voice recognition and dictation applications, music sections must be detected and removed automatically from the input before recognition. To detect voice and music segments in sound data that contains both, this paper proposes a weighted Block Cepstrum Flux (BCF) and optimizes the weight vector using a discriminative training technique. The paper also discusses the effectiveness of frequency-axis weighting in calculating Cepstrum Flux and BCF; here, frequency-axis weighting is carried out by modifying the LPC cepstrum distance calculation. The experimental results show that the detection error rate of the original BCF is 11.56%, while the error rate of the weighted BCF with low-frequency weighting is 9.08% for closed data and 10.48% for open data. This result shows the effectiveness of both time-axis and frequency-axis weighting in the BCF calculation for discriminating between voice and music.
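The abstract does not give the exact BCF formula, so the following is only a minimal sketch of the general idea under stated assumptions: the flux at frame t is taken as a weighted average of cepstral distances between frame t and its K preceding frames (time-axis weights), and the distance itself is a weighted Euclidean distance over cepstral coefficients (a stand-in for the paper's frequency-axis weighting). The function names, the parameters `lag_weights` and `coef_weights`, and the normalization are illustrative assumptions, not the authors' definitions.

```python
import math


def cepstral_distance(c1, c2, coef_weights=None):
    """Weighted Euclidean distance between two cepstral vectors.

    coef_weights is a hypothetical per-coefficient weight vector standing in
    for the frequency-axis weighting mentioned in the abstract.
    """
    if coef_weights is None:
        coef_weights = [1.0] * len(c1)
    return math.sqrt(
        sum(w * (a - b) ** 2 for w, a, b in zip(coef_weights, c1, c2))
    )


def weighted_bcf(frames, t, lag_weights, coef_weights=None):
    """Assumed form of a weighted block cepstrum flux at frame index t.

    frames: list of cepstral vectors, one per analysis frame.
    lag_weights: time-axis weights; lag_weights[k-1] weights the distance
                 between frame t and frame t-k.
    Returns the weighted average distance over the lags available at t,
    or 0.0 when no earlier frame exists.
    """
    total = 0.0
    norm = 0.0
    for k, w in enumerate(lag_weights, start=1):
        if t - k < 0:
            break  # not enough history for this lag
        total += w * cepstral_distance(frames[t], frames[t - k], coef_weights)
        norm += w
    return total / norm if norm > 0 else 0.0


# Toy cepstral vectors (made-up numbers, for illustration only).
frames = [[0.0, 0.0], [3.0, 4.0], [3.0, 4.0]]
flux_uniform = weighted_bcf(frames, 2, [1.0, 1.0])
flux_recent = weighted_bcf(frames, 2, [3.0, 1.0])
```

In this sketch, a discriminative training step would adjust `lag_weights` (and possibly `coef_weights`) to reduce voice/music classification errors on labeled data; that optimization is not shown here because the abstract does not specify it.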
Similar resources
Detecting key features in popular music: case study – singing voice detection
Detecting distinct features in modern pop music is an important problem that can have significant applications in areas such as multimedia entertainment. They can be used, for example, to give a visually coherent representation of the sound. We propose to integrate a singing voice detector with a multimedia, multi-touch game where the user has to perform simple tasks at certain key points in th...
Detecting key features in popular music: case study – voice detection
Detecting distinct features in modern pop music is an important problem that can have significant applications in areas such as multimedia entertainment. They can be used, for example, to give a visually coherent representation of the sound. The work developed for this project is meant to be used in the context of a multimedia, multi-touch game where the user has to perform simple tasks at the ...
Data-Intensive Sound Acquisition System with Large-scale Microphone Array
We propose a microphone array network that realizes ubiquitous sound acquisition. Several nodes with 16 microphones are connected to form a novel huge sound acquisition system, which carries out voice activity detection (VAD), sound source localization, and sound enhancement. The three operations are distributed among nodes. Using the distributed network, we produce a low-traffic data-intensive ...
System and Method for Automatic Singer Identification
Keywords: singer identification, singing voice detection, music analysis, music classification, music retrieval, audio browsing, music database management. The singer's information is essential in organizing, browsing and retrieving music collections. In this technical report, a system for automatic singer identification is developed which recognizes the singer of a song by analyzing the music signal. Mea...
Singing Voice Detection in North Indian Classical Music
Singing voice detection is essential for content-based applications such as those involving melody extraction and singer identification. This article is concerned with the accurate detection of singing voice phrases in north Indian classical vocal music. The component sound sources in such music fit into a typical framework (voice, rhythm and drone). We have used this a priori knowledge to enhan...